An Approach for Cluster-Based Multicast Routing in
Large-Scale Networks



Yibo ZHANG, Weiping ZHAO, Shunji ABE, and Shoichiro ASANO, Members



SUMMARY This paper addresses the optimum routing problem of
multipoint connection in large-scale networks. A number of algorithms
for routing of multipoint connection have been studied so far; most of
them, however, assume the availability of complete network
information. Herein, we study the problem under the condition that
only partial information is available to routing nodes and that
routing decisions are carried out in a distributed cooperative manner.
We consider the network being partitioned into clusters and propose a
cluster-based routing approach for multipoint connection. Some basic
principles for network clustering are discussed first. Next, the
original multipoint routing problem is defined and is divided into two
types of subproblems. The global optimum multicast tree can then be
obtained asymptotically by solving the subproblems one after another
iteratively. We propose an algorithm and evaluate it with computer
simulations. By measuring the running time of the algorithm and the
optimality of the resultant multicast tree, we analyze the convergence
property with varying network cluster sizes, multicast group sizes and
network sizes. The presented approach has two main characteristics:
1) it can yield asymptotically optimum solutions for the routing of
multipoint connection, and 2) the routing decisions can be made in an
environment where only partial information is available to routing
nodes.
key words: multicast routing, multipoint connection, large-scale
network, clustering, global optimization, aggregate/disaggregate flow

1. Introduction 

The advent of advanced switching technologies makes it possible to
realize efficient multipoint connection, which has applications in
many emerging communication-based systems, e.g., video on demand
service, teleconferencing and distance education. Because a multipoint
connection may occupy a large amount of network resources for a long
duration, the transmission routes used for the connection have to be
selected such that they consume minimum network resources. Tree-shaped
routes (i.e., multicast trees) that connect the source and the
destinations usually serve as the candidates for such communications.

The optimization objective for the routing of multipoint connection is
to minimize the multicast tree cost, which is the sum of the link costs
of the multicast tree. The minimum-cost tree is termed a Steiner tree;
finding such a tree is known to be NP-complete. The Steiner tree
problem was first studied in the area of graph theory. A variety of
algorithms for finding both exact and approximately optimum solutions
have been proposed so far [1], of which the heuristic algorithms are
beneficial to real-time applications because they find approximate
solutions within polynomial time. In the study of the realization of
multipoint connection in communication networks, some concrete issues
have been considered in the literature. For instance, [3] has studied
the multipoint connection problem where destinations change during the
life of the connection. A similar solution using a quasi-static method
based on statistical traffic patterns has been presented in [6].
Delay-constrained multicast routing has been examined in [5], [7];
[4] has investigated the routing problem for multipoint multi-stream
connection. Generating multicast trees in networks with directed links
of different characteristics was presented in [8]. Instead of
enumerating all of the previous work, our goal here is to identify a
problem that has been ignored so far. Note that most of the existing
algorithms are suitable for centralized implementation because they
implicitly assume that complete information of the network is given;
they may scale poorly or might even be inapplicable to large-scale
networks. Thus, techniques which can deal with a global optimization
objective efficiently under the condition of partial information are
necessary. In this respect, some investigation can be found in [2],
where, however, only shortest-path based multicast routing has been
considered.

In this paper, we focus on a routing approach which can achieve global
optimum solutions in large-scale networks. A general routing problem
requires us to make optimum routing decisions quickly while the traffic
and resource information about the network is changing. We assume that
the network information is managed locally by individual clusters, and
set the objective as finding the global optimum multicast tree for
multipoint connection. The idea of clustering has been adopted in the
routing of conventional point-to-point connection, and is known to be
effective for fast routing [10], [11]. A new task arises for the
routing of multipoint connection because its objective function
differs from that of point-to-point connection. We call the problem of
finding global optimum multicast trees in such an environment the
cluster-based multicast routing (CBMR) problem. Compared with its
general counterparts, the CBMR problem is more complicated due to the
lack of complete network information. To solve the CBMR problem, we
first divide the problem into two types of subproblems, which are
generated by separating intra-cluster and inter-cluster flows.
Aggregate and disaggregate flows are used to define the subproblems.
Each subproblem is formulated as an integer linear programming (ILP)
problem and is then solved by using a linear programming (LP) method.
The solution is obtained in a convergent way by solving the
subproblems one after another iteratively. Routing decisions are made
in a distributed cooperative manner. That is, multiple routing nodes,
each of which corresponds to one cluster, may be involved in the
routing decisions. Because optimum solutions are found asymptotically,
there is a trade-off between the running time of the algorithm and the
optimality of the obtained solutions, which can be exploited according
to the performance requirement. We evaluate the algorithm with
computer simulations, and analyze the convergence property with
varying network cluster sizes, multicast group sizes and network
sizes. We also compare our algorithm with one conventional algorithm.

The remainder of this paper is organized as follows. Section 2 briefly
explains the principles for network clustering, describes how to
generate the object graph, and specifies the general form of the
cluster-based multicast routing problem. Section 3 defines the problem
reformulation and presents the corresponding algorithm for multicast
routing. Section 4 is dedicated to numerical analysis with computer
simulations. Finally, Sect. 5 gives concluding remarks.

2. The Cluster-Based Multicast Routing 

2.1 Network Clustering 

Given a large-scale network, for generality, we assume that the
network (1) does not have a regular topology, and (2) is not fully
connected.

Network clustering is considered according to the following two simple
requirements: first, making routing decisions fast, and second,
managing the network traffic and resource information efficiently. We
pay more attention to the second requirement because an enormous
amount of network traffic and resource information needs to be handled
for large-scale and broadband networks, and because multipoint
connection will probably consume a large amount of network resources.
After network clustering, the overall network is partitioned into
small subnetworks, namely, clusters. Hereafter, we use cluster to
represent the management unit for network resources. The resource
information of one cluster is available to the routing node(s) of this
cluster, but is not available to those of any other cluster. Any two
clusters, however, share the resource information of any links
connecting these two clusters.

The network is divided into clusters such that clusters do not
overlap, which means no node is shared by any two clusters. This
situation can always be retained because a shared node of two clusters
can be divided into two pseudo-nodes belonging to the two clusters
respectively and connected by an artificial link with infinite
capacity. The cluster size, indicated by the number of nodes in the
cluster, is an important parameter which should be determined by three
factors. The first is the information storage and processing capacity
possessed by the routing nodes. The second is the time limit for
network information propagation in the interval between two
consecutive routing decisions. The third is the set of issues stemming
from optimum routing considerations. Obviously, when the cluster size
is large, the routing nodes need to have high storage and processing
capacity, the transmission of network information is costly, and the
global optimum solution of routing is likely to be achieved. There are
two extreme instances in which the cluster size equals 1 or N (where N
is the total number of nodes in the network). It is not
straightforward to develop efficient routing approaches for
large-scale networks under either of these instances, so we exclude
them from consideration.
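The node-splitting argument above can be illustrated with a minimal Python sketch. The adjacency-dictionary representation, the function name `split_shared_node`, and the naming of pseudo-nodes as `(node, cluster)` tuples are assumptions made only for this illustration; the paper does not prescribe a data structure.

```python
import math


def split_shared_node(adj, cluster_of_neighbor, node, cluster_a, cluster_b):
    """Split `node`, shared by two clusters, into two pseudo-nodes.

    adj: dict mapping node -> {neighbor: capacity}; the graph is undirected,
         so every link is stored in both directions (assumed representation).
    cluster_of_neighbor: dict mapping each neighbor of `node` to the cluster
         (cluster_a or cluster_b) on whose side that neighbor lies.
    Returns the two pseudo-node names.
    """
    node_a = (node, cluster_a)
    node_b = (node, cluster_b)
    adj[node_a], adj[node_b] = {}, {}

    # Re-attach every former neighbor to the pseudo-node of its own cluster.
    for nbr, cap in adj.pop(node).items():
        pseudo = node_a if cluster_of_neighbor[nbr] == cluster_a else node_b
        adj[pseudo][nbr] = cap
        del adj[nbr][node]
        adj[nbr][pseudo] = cap

    # Artificial link with infinite capacity between the two pseudo-nodes.
    adj[node_a][node_b] = math.inf
    adj[node_b][node_a] = math.inf
    return node_a, node_b
```

After the split, each pseudo-node belongs to exactly one cluster, so the clusters no longer overlap, while the infinite-capacity link keeps the two halves connected without adding any capacity restriction.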

As this paper focuses on the routing method, we simplify the
discussion of the clustering method. It should be noted that our
routing method does not impose requirements on the clustering
procedure. The network clustering can be carried out depending on the
value of the given cluster size, the connectivity of nodes and/or the
distance between nodes. A basic property that must be guaranteed is
that no disconnected node is allowed to appear in any cluster. As a
result of clustering, dense connectivity among clusters can be
achieved, even for sparse networks.

In the cluster-based model, one or multiple routing nodes of each
cluster play the role of both making routing decisions within the
cluster and negotiating with their counterparts in other clusters if
the route will probably traverse those clusters. Note that each of
these nodes has partial information of the network, i.e., it possesses
only local information of the cluster it belongs to. Based on the
local information, each cluster provides minimum cost path information
to adjacent clusters; thus, each cluster can estimate minimum cost
paths from it to other clusters.

2.2 Generating Object Graph 

In a large-scale network, usually only a portion of the clusters
contain member(s) of a certain multicast group. We create an object
graph from the original network graph by eliminating the clusters that
do not have multicast member(s) (as shown in Fig. 1). The boundary
nodes of different clusters which were connected with the eliminated
clusters are connected with new links. The costs of the new links are
equal to the costs of the shortest paths between the corresponding
boundary nodes. A shortest path can be calculated by using
conventional shortest path algorithms, e.g., the Bellman-Ford
algorithm [11]. The generated object graph is used as the base for
solving the routing problem in later sections.

Fig. 1 An original network graph and its object graph: (a) original
network graph, (b) object graph.
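As an illustration of how the cost of one new link in the object graph might be computed, the following Python sketch runs a textbook Bellman-Ford relaxation over the nodes of an eliminated cluster plus the two boundary nodes. The function names, the link-list representation, and the assumption of non-negative link costs (so that no negative-cycle check is needed) are ours, not the paper's.

```python
import math


def bellman_ford(nodes, links, source):
    """Return the minimum path cost from `source` to every node.

    links: list of directed (i, j, cost) triples with non-negative costs.
    """
    dist = {n: math.inf for n in nodes}
    dist[source] = 0.0
    # At most |V| - 1 relaxation rounds are needed.
    for _ in range(len(nodes) - 1):
        changed = False
        for i, j, cost in links:
            if dist[i] + cost < dist[j]:
                dist[j] = dist[i] + cost
                changed = True
        if not changed:
            break
    return dist


def new_link_cost(nodes, undirected_links, a, b):
    """Cost of the new object-graph link between boundary nodes a and b.

    `nodes` holds a, b and the nodes of the eliminated cluster(s) between
    them; each undirected link may be traversed in both directions.
    """
    directed = list(undirected_links)
    directed += [(j, i, c) for i, j, c in undirected_links]
    return bellman_ford(nodes, directed, a)[b]
```

If the two boundary nodes are not connected through the eliminated cluster, the sketch returns infinity, which corresponds to adding no new link.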

2.3 Problem Formulation 

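The object graph is represented by $G = (V, E)$, where $V$ denotes the
set of nodes and $E$ denotes the set of links; each link
$(i, j) \in E$ connecting nodes $i$ and $j$ is associated with a
capacity of $b_{ij}$. The costs imposed for unit flow over the links
are determined by a predetermined cost function. The value of such a
cost $c_{ij}$ ($(i, j) \in E$) is given and used to calculate the cost
for a flow over a certain path. In the cost structure, only link costs
are considered while other costs such as node related costs (e.g.,
switching costs) are ignored herein. In the object graph, there are
$m$ clusters, denoted by $\Theta = \{p_1, p_2, \ldots, p_m\}$. Given a
set of nodes, $Z = \{s, Z_d\}$, where $s$ is the source and $Z_d$ is
the set of destinations in the multicast group, and given the
bandwidth, $f$, required by the connection, the optimization problem
can be formulated as follows.

$$\min \Big\{ \sum_{p_r \in \Theta} \sum_{i,j \in p_r} c_{ij} x_{ij} + \sum_{\substack{p_r, p_q \in \Theta \\ r \neq q}} \sum_{i \in p_r,\, j \in p_q} c_{ij} x_{ij} \Big\} \tag{1}$$

subject to

$$x_{ij} \ge x_{ij}^{k} \qquad \forall (i,j) \in E,\ \forall k \in Z_d \tag{2a}$$

$$\sum_{j:(i,j) \in E} x_{ij}^{k} - \sum_{j:(j,i) \in E} x_{ji}^{k} = \begin{cases} 1, & \text{if } i = s \\ -1, & \text{if } i = k \\ 0, & \text{otherwise} \end{cases} \qquad \forall i \in V,\ \forall k \in Z_d \tag{2b}$$

$$x_{ij} \in \{0, 1\} \qquad \forall (i,j) \in E \tag{2c}$$

$$x_{ij}^{k} \ge 0 \qquad \forall (i,j) \in E,\ \forall k \in Z_d \tag{2d}$$

$$x_{ij} \le \lfloor b_{ij}/f \rfloor \qquad \forall (i,j) \in E \tag{2e}$$

The expression $\lfloor y \rfloor$ in (2e) means the maximum integer
less than or equal to $y$.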

In the objective function, the first and second terms are the sums of
intra-cluster flow costs and inter-cluster flow costs, respectively,
in the multicast tree. Constraints (2a) imply that a link used by any
path from the source to one destination must be included in the
resultant tree. Constraints (2b) describe the transmission condition
for any node according to the position of this node on the path from
the source to a destination. Constraints (2c) imply that any link may
or may not be contained in the resultant tree, and (2d) describe the
similar situation for a link as an element of the path from the source
to a destination. The constraints (2a) to (2d) ensure that each
possible route is a tree which connects all the nodes in Z.
Constraints (2e) ensure that the possible tree is constituted of links
having enough available capacity. In the above formulation, the
problem is described as a 0-1 integer programming problem. It can also
be viewed as a special kind of multicommodity flow problem where each
flow corresponds to a path from the source to one of the destinations.
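To make the objective and the capacity constraint concrete, the sketch below evaluates the cost of a candidate tree (the sum of its link costs) and checks the condition of (2e) for every tree link. The dictionary-based representation and the function names are hypothetical and serve only as an illustration.

```python
import math


def tree_cost(tree_links, cost):
    """Multicast tree cost: the sum of the costs of the links in the tree."""
    return sum(cost[link] for link in tree_links)


def capacity_ok(tree_links, capacity, f):
    """Check constraint (2e) for a candidate tree.

    A link used by the tree has x_ij = 1, so (2e) requires
    floor(b_ij / f) >= 1, i.e. the link can carry the bandwidth f.
    """
    return all(math.floor(capacity[link] / f) >= 1 for link in tree_links)
```

For example, a tree using links with costs 1, 2 and 3 has cost 6; it is admissible for bandwidth f only if every one of its links has capacity at least f.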

3. Solving the Global Optimum Multicast Routing 
Problem 

In this section, we present an approach to solving the problem
described in Sect. 2.3. Because the information about a cluster is
available only to the local routing node(s) of this cluster, it is
generally impossible to obtain global optimum multicast routing
solutions without negotiating with the routing nodes of other
clusters. We employ a decomposition method for the object graph, and
apply an asymptotic technique to find multicast trees. The problem is
decomposed and reformulated in terms of aggregate flows. This is a
common idea extensively used for solving flow optimization problems of
large networks [12]-[14]. In particular, we extend the idea generally
used for solving minimum-cost flow problems to solve the optimum
multicast routing problem in this paper.

3.1 Aggregate Flow Based Problem Decomposition 

There are two types of flows, i.e., the intra-cluster flows and the
inter-cluster flows, existing in the object network. Recall that the
routing node for any cluster only possesses the information about the
links in the cluster and the links connected with other clusters. A
cluster can be viewed as a "node" (aggregate node) by other clusters,
and the links between a node in the cluster and an aggregate node are
then aggregated into one single "link" (aggregate link), as
illustrated in Fig. 2. In the graph, a dummy node is introduced to
ensure problem feasibility and to reduce the complexity in
representing potential links between aggregate nodes. For an object
graph which contains m clusters, we will have m graphs as depicted in
Fig. 2.

Additionally, flows between two clusters can be observed independently
by splitting these clusters off as illustrated in Fig. 3. The links
between two clusters are intended to accommodate the possible
aggregate flows on the aggregate links in Fig. 2. These links are
directional with the direction from the left cluster to the right one.
Thus, two such graphs exist for any two connected clusters. The total
number of such graphs depends on the connectivity among clusters. Note
that only those nodes sharing link(s) with some node(s) in the other
cluster are contained in the graph. Two dummy nodes are added, each of
which connects bidirectionally with nodes in one of the two clusters.

Fig. 2 A graph with one cluster and aggregate nodes.

Fig. 3 Two graphs with two connected clusters.

3.2 Problem Reformulation 

Two categories of subproblems can be derived from the above
decomposition. They correspond to the two types of graphs shown in the
previous subsection. The first type of subproblem is to solve the
routing problem in one cluster and the associated aggregate nodes,
while the second type of subproblem is to solve the routing problem
between clusters. We use $P_1$ and $P_2$ to identify the two types of
subproblems respectively and give their formulations later.

Prior to proceeding, let us assign the notations to be used as follows
(those having been used previously are not repeated).

$y_{i p_q}$: Flow on aggregate link $(i, p_q)$
$y_{p_q j}$: Flow on aggregate link $(p_q, j)$
$y_{rq}$: $\{(y_{i p_q}, y_{p_r j}) \mid i \in p_r, j \in p_q\}$
$u_{i p_q}$: Cost for unit flow on aggregate link $(i, p_q)$
$u_{p_q i}$: Cost for unit flow on aggregate link $(p_q, i)$
$u_r$: $\{(u_{i p_q}, u_{p_q i}) \mid i \in p_r;\ q = 1, 2, \ldots, m;\ q \neq r\}$
$d_r$: Dummy node for the clusters that are connected with cluster r in subproblem $P_1$
$d^1_{rq}$: Dummy node connected with cluster r in subproblem $P_2$
$d^2_{rq}$: Dummy node connected with cluster q in subproblem $P_2$
$v_{i p_q}$: Dual variable associated with aggregate link $(i, p_q)$
$v_{p_q i}$: Dual variable associated with aggregate link $(p_q, i)$
$v_{d^1_{rq}}$: Dual variable associated with dummy node $d^1_{rq}$
$v_{d^2_{rq}}$: Dual variable associated with dummy node $d^2_{rq}$
$w_{ij}$: Dual variable associated with the flow on a link between two clusters
$w_{i d^1_{rq}}$: Dual variable associated with link $(i, d^1_{rq})$
$w_{d^2_{rq} j}$: Dual variable associated with link $(d^2_{rq}, j)$
$c$: $\{c_{ij} \mid i, j \in p_r;\ r = 1, 2, \ldots, m\}$
$x$: $\{x_{ij} \mid i, j \in p_r;\ r = 1, 2, \ldots, m\}$
$v$: $\{(v_{i p_q}, v_{p_q i}) \mid i \in p_r;\ r, q = 1, 2, \ldots, m;\ r \neq q\}$
$y$: $\{(y_{i p_q}, y_{p_q i}) \mid i \in p_r;\ r, q = 1, 2, \ldots, m;\ r \neq q\}$
$w$: $\{(w_{ij} \mid i \in p_r, j \in p_q;\ r, q = 1, 2, \ldots, m;\ r \neq q),\ (w_{i d^1_{rq}}, w_{d^2_{rq} j} \mid i \in p_r, j \in p_q;\ r \neq q;\ r, q = 1, 2, \ldots, m)\}$
$b$: $\{\lfloor b_{ij}/f \rfloor \mid i \in p_r, j \in p_q;\ r, q = 1, 2, \ldots, m;\ r \neq q\}$, together with a second component indexed over the same cluster pairs that is not legible in the source
$p_{r \to q}$: The set of boundary nodes in cluster $p_r$ connected with cluster $p_q$
$p_{q \to r}$: The set of boundary nodes in cluster $p_q$ connected with cluster $p_r$
$v_q$: $\{(v_{i p_r}, v_{p_r i}) \mid i \in p_q;\ r = 1, 2, \ldots, m;\ r \neq q\}$
$w_q$: $\{(w_{ij} \mid i \in p_q, j \in p_r;\ r = 1, 2, \ldots, m;\ r \neq q),\ (w_{i d^1_{rq}}, w_{d^2_{rq} j} \mid i \in p_q, j \in p_r;\ r \neq q;\ r = 1, 2, \ldots, m)\}$
$x_r$: $\{x_{ij} \mid i, j \in p_r\}$
$y_r$: $\{(y_{i p_q}, y_{p_q i}) \mid i \in p_r;\ q = 1, 2, \ldots, m;\ q \neq r\}$
$v_q^t, w_q^t, x_r^t, y_r^t$: Variable sets associated with $v_q, w_q, x_r, y_r$ respectively (indicating the results of the t-th iteration of the routing algorithm)
$v_q(t), w_q(t), x_r(t), y_r(t)$: Variable sets associated with $v_q, w_q, x_r, y_r$ respectively (indicating the results of the first t iterations of the routing algorithm).

Suppose $u_r$ is given. The objective function of $P_1$ is to minimize
the sum of the intra-cluster flow costs of cluster r and the
inter-cluster flow costs on the aggregate links between the nodes of
cluster r and other clusters.



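$P_1(r, u_r)$:

$$\min \Big\{ \sum_{i,j \in p_r} c_{ij} x_{ij} + \sum_{i \in p_r} \sum_{p_q \in \Theta \setminus \{p_r\}} \big( u_{i p_q} y_{i p_q} + u_{p_q i} y_{p_q i} \big) \Big\} \tag{3}$$

subject to

$$x_{ij} \ge x_{ij}^{k} + x_{ji}^{k} \qquad \forall i, j \in p_r,\ (i,j) \in E,\ \forall k \in Z_d \tag{4a}$$

$$y_{i p_q} \ge y_{i p_q}^{k} \qquad \forall i \in p_r,\ p_q \in \Theta \setminus \{p_r\},\ \forall k \in Z_d \tag{4b}$$

$$y_{p_q i} \ge y_{p_q i}^{k} \qquad \forall i \in p_r,\ p_q \in \Theta \setminus \{p_r\},\ \forall k \in Z_d \tag{4c}$$

$$\sum_{j \in p_r} \big( x_{ij}^{k} - x_{ji}^{k} \big) + \sum_{p_q \in \Theta \setminus \{p_r\}} \big( y_{i p_q}^{k} - y_{p_q i}^{k} \big) = \begin{cases} 1, & \text{if } i = s \\ -1, & \text{if } i = k \\ 0, & \text{otherwise} \end{cases} \qquad \forall i \in p_r,\ \forall k \in Z_d \tag{4d}$$

$$\big[\text{net outflow of commodity } k \text{ at aggregate node } p_q; \text{ left-hand side not legible in the source}\big] = \begin{cases} 1, & \text{if } s \in p_q,\ k \notin p_q \\ -1, & \text{if } k \in p_q,\ s \notin p_q \\ 0, & \text{otherwise} \end{cases} \qquad \forall k \in Z_d;\ p_q \in \Theta \setminus \{p_r\} \tag{4e}$$

$$0 \le x_{ij} \le 1 \qquad \forall i, j \in p_r,\ (i,j) \in E \tag{4f}$$

$$0 \le y_{i p_q} \le 1 \qquad \forall i \in p_r,\ p_q \in \Theta \setminus \{p_r\} \tag{4g}$$

$$0 \le y_{p_q i} \le 1 \qquad \forall i \in p_r,\ p_q \in \Theta \setminus \{p_r\} \tag{4h}$$

$$x_{ij} \le \lfloor b_{ij}/f \rfloor \qquad \forall i, j \in p_r,\ (i,j) \in E \tag{4i}$$

$$\text{All variables are nonnegative integers.} \tag{4j}$$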

The constraints (4a) to (4h) ensure locally that a possible route is a
tree which connects the nodes in Z. Constraints (4a) to (4c) are an
extension of (2a). Constraints (4d) and (4e) are an extension of (2b).
(4f) correspond to (2c), while (4g) are new constraints specific to
aggregate links. (4i) guarantee that only those links with enough
capacity can be used to accommodate the given flow. $P_1$ is a mixed
integer programming problem in terms of multicommodity flows, where
each flow corresponds to an embedded path in the tree from the source
to one of the destinations.

Given clusters r, q, and the aggregate flows $y_{rq}$ between these
clusters, the objective function of $P_2$ is to minimize the sum of
the flow costs on the links between cluster r and cluster q and the
flow costs on the artificial links between cluster r or q and the
dummy node associated with it. The unit flow costs on the artificial
links are assigned large positive integer values, so that a given flow
from one node to the corresponding aggregate node prefers to pass
through the links existing between them, rather than traversing any
other intermediate nodes in these clusters.

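$P_2((r, q), y_{rq})$:

$$\min \Big\{ \sum_{i \in p_{r \to q}} \sum_{j \in p_{q \to r}} c_{ij} x_{ij} + M \times \Big( \sum_{i \in p_{r \to q}} x_{i d^1_{rq}} + \sum_{j \in p_{q \to r}} x_{d^2_{rq} j} \Big) \Big\} \tag{5}$$

subject to

[Constraint (6a) is not legible in the source.]

$$\sum_{i \in p_{r \to q}} x_{ij} + x_{d^2_{rq} j} - x_{j d^2_{rq}} = y_{p_r j} \qquad \forall j \in p_{q \to r} \tag{6b}$$

[Constraints (6c) and (6d) are not legible in the source.]

$$0 \le x_{ij} \le 1 \qquad \forall i \in p_{r \to q},\ \forall j \in p_{q \to r},\ (i,j) \in E \tag{6e}$$

$$0 \le x_{a} \le 1 \quad \text{for each artificial link } a \text{ between } d^1_{rq} \text{ and } i \in p_{r \to q}, \text{ or between } d^2_{rq} \text{ and } j \in p_{q \to r}, \text{ in either direction} \tag{6f--6i}$$

$$x_{ij} \le \lfloor b_{ij}/f \rfloor \qquad \forall i \in p_{r \to q},\ \forall j \in p_{q \to r},\ (i,j) \in E \tag{6j}$$

$$\text{All variables are nonnegative integers.} \tag{6k}$$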

Equations (6a) and (6b) ensure that the given aggregate flows are
enforced on the relevant links. The flows on artificial links are also
bounded by one, as specified in constraints (6f) to (6i).

The original CBMR problem can then be reformulated based upon these
subproblems. The objective function of the reformulated problem can be
written as

$$\min \Big\{ \sum_{r} \sum_{i,j \in p_r} c_{ij} x_{ij} + \sum_{r} \sum_{q\,(q \neq r)} P_2((r,q), y_{rq}) \Big\} \tag{7}$$

where $P_2((r,q), y_{rq})$ denotes the optimum solution of $P_2$. This
is an integer linear programming (ILP) problem. The optimum solution
of the corresponding linear programming (LP) problem is not
necessarily the optimum solution of the ILP problem, but the former
can be a bound of the latter.

Now we relax this problem by introducing the linear programming
relaxations of $P_1$ and $P_2$, namely $LP_1$ and $LP_2$, which are
obtained by removing the constraints (4j) and (6k). This does not
change the integral property of solutions according to the following
principle shown in [15].

Integral Property. For minimum cost flow problems having integer
bounds and demands, all basic variables (including the optimum one)
also have integer values.

The relaxed objective is

$$\min \Big\{ \sum_{r} \sum_{i,j \in p_r} c_{ij} x_{ij} + \sum_{r} \sum_{q\,(q \neq r)} LP_2((r,q), y_{rq}) \Big\},$$

where $LP_2((r,q), y_{rq})$ denotes the optimum solution of $LP_2$.
Using the dual of $LP_2$, i.e., $DLP_2$, we have

$$\min \Big\{ \sum c\,x + \sum_{r} \sum_{q} DLP_2((r,q), y_{rq}) \Big\} = \min_{x, y \ge 0} \Big( c \cdot x + \max_{v, w \ge 0} ( v \cdot y + w \cdot b ) \Big),$$

that is (TOS is the abbreviation of The Optimum Solution),

$$\max_{v, w \ge 0} \Big( \min_{x, y \ge 0} ( c \cdot x + v \cdot y ) + w \cdot b \Big) = \mathrm{TOS}$$

and

$$\min_{x, y \ge 0} \Big( \max_{v, w \ge 0} ( v \cdot y + w \cdot b ) + c \cdot x \Big) = \mathrm{TOS}.$$

That is,

$$\max_{v, w \ge 0} \Big\{ \sum_{r} LP_1(r, v_r) + w \cdot b \Big\} = \mathrm{TOS} \tag{8}$$

and

$$\min_{x, y \ge 0} \Big\{ \sum_{r} \sum_{q} DLP_2((r,q), y_{rq}) + c \cdot x \Big\} = \mathrm{TOS}. \tag{9}$$
According to (8) and (9), it is possible to find the lower and upper
bounds for the optimum solution. By diminishing the gap between these
bounds, the optimum solution can be obtained asymptotically.

3.3 Routing Algorithm 

In this section, we present an algorithm for CBMR based on the problem
reformulation described previously. According to (8) and (9), our task
is to find the lower and upper bounds for the optimum solution. By
reducing the gap between the two bounds, an asymptotically optimum
solution can be obtained. In particular, if the gap becomes zero, the
optimum solution has been found. The algorithm consists of three
phases: the first phase determines the routing nodes to be involved in
making routing decisions and initializes some variables; the second
phase uses an optimization loop to approach the optimum solution; the
third phase yields the final solution. An input parameter, $\epsilon$,
is given as the threshold for stopping the algorithm (i.e., for
exiting the optimization loop).

The algorithm is realized in a distributed cooperative manner. A
center node needs to be selected to collect local computation results
and to calculate the lower and upper bounds. During the running of the
algorithm, the costs of aggregate links are updated to reflect any
decision made by local routing nodes. The updated costs are notified
to the routing nodes of the corresponding clusters. For instance,
$v_{p_r j}$ ($v_{i p_q}$) is updated by the routing node of cluster r
(q) and its new value is sent to the routing node of the cluster to
which j (i) belongs. The new values are then used in making routing
decisions in the next iteration.

In the algorithm, we distinguish clusters using the following terms.

Source cluster: A cluster where the source node of the multicast group
is contained.

Destination cluster: A cluster which contains only destination nodes
of the multicast group.

Given $\Theta = \{p_1, p_2, \ldots, p_m\}$ (the set of clusters which
contain multicast members), we suppose $p_1$ represents the source
cluster and the others represent destination clusters.

Algorithm:

Phase 1 - Initialization

(1) Determine the center node and the routing nodes (each routing node
is selected from a cluster in $\Theta$; we can choose the routing node
in the source cluster as the center node). Confirm the neighborhood
among these nodes and establish the necessary communication channels
between them. Set aggregate link costs $u_{i p_q} = 0$,
$u_{p_q i} = \infty$ ($\forall i \in p_r$, $\forall p_r \in \Theta$,
and $\forall p_q \in \Theta \setminus \{p_r\}$).

(2) In the source cluster, solve $LP_1(1, u_1)$. This yields a tree
among the multicast member nodes in the source cluster and its
adjacent aggregate nodes. The aggregate link costs $u_{p_1 j}$
($\forall j \in p_q$, $p_q \in \Theta \setminus \{p_1\}$) are obtained
by calculating the minimum-cost path between j and the resultant tree.
Send $u_{p_1 j}$ to the routing node of the cluster to which j
belongs.

(3) In destination cluster q (q = 2, 3, ..., m), update $u_{p_r j}$
($p_r \in \Theta \setminus \{p_q\}$, $j \in p_q$) if a new value was
received from the routing node of cluster r. If there is a link
between node j of cluster q and node k of cluster p
($p_p \in \Theta \setminus \{p_r, p_q\}$), compute the corresponding
aggregate link cost for node k (the update expression, a sum of two
terms, is not legible in the source) and send it to the routing node
of cluster p. Solve $LP_1(q, u_q)$. This yields $(x_q^0, y_q^0)$. Send
the elements of $y_q^0$ to the routing nodes of related clusters. Let
$x_q(0) = x_q^0$, $y_q(0) = y_q^0$.

(4) Solve $DLP_2((r,q), y_{rq})$ in the routing node of cluster q if
$y_{rq} \neq 0$. This yields dual extreme solutions
$(v_{rq}^1, w_{rq}^1)$. Send the non-zero elements of $v_{rq}^1$ to
the routing node of cluster r, and $w_{rq}^1$ to the center node. Set
$v_r(0) = 0$, $w_r(0) = 0$. In the center node, set the iteration
number t = 1, $LB(0) = -\infty$, $UB(0) = \infty$.

Phase 2 - Optimization Loop

(5) In the source cluster, solve $LP_1(1, v_1(t-1))$. This yields the
solution $(x_1^t, y_1^t)$. The costs $u_{p_1 j}$ ($\forall j \in p_q$,
$p_q \in \Theta \setminus \{p_1\}$) are updated by calculating the
minimum-cost path between j and the resultant tree. Send $u_{p_1 j}$
to the routing node of the cluster to which j belongs.

(6) In destination cluster q (q = 2, 3, ..., m), if non-zero elements
exist in $v_q^t$ or $w_q^t$, update the vectors $v_q(t)$ and $w_q(t)$
as follows (only for those elements whose corresponding elements of
$v_q^t$ or $w_q^t$ are not zero).

$$v_q(t) = \frac{t-1}{t}\, v_q(t-1) + \frac{1}{t}\, v_q^t,$$

$$w_q(t) = \frac{t-1}{t}\, w_q(t-1) + \frac{1}{t}\, w_q^t.$$

Solve $LP_1(q, v_q(t))$. This yields $(x_q^t, y_q^t)$. Send $y_q^t$ to
the routing nodes of related clusters. The costs $u_{p_q j}$
($\forall j \in p_r$, $p_r \in \Theta \setminus \{p_q\}$,
$y_{p_q j} = 1$) are obtained by calculating the minimum-cost path
between j and the subtree consisting of the non-zero elements of
$x_q^t$. Send $c \cdot x_q^t$ and
$\sum_{j:\, y_{p_q j} = 1} u_{p_q j}\, y_{p_q j}$ to the center node.
In the above, $v_q$ and $w_q$ are set to averages of their previous
values weighted by the numbers of their occurrences (utilizing the
information obtained so far). Through this approach, the values of the
vector elements get closer to the values (stable points) which
occurred most frequently.

(7) In cluster q, solve $DLP_2((r,q), y_{rq}^t)$ to obtain the dual
vectors $(v_{rq}^{t+1}, w_{rq}^{t+1})$. Send the non-zero elements of
$v_{rq}^{t+1}$ to the routing node of cluster r, and $w_{rq}^{t+1}$ to
the center node.

(8) In the center node, update the lower bound for the optimum
solution as

$$LB(t) = \max \Big\{ \sum_{r} LP_1(r, v_r(t)) + w(t) \cdot b,\ LB(t-1) \Big\},$$

while the upper bound for the optimum solution is updated as

$$UB(t) = \min \Big\{ \sum_{r} \sum_{q} DLP_2((r,q), y_{rq}) + c \cdot x,\ UB(t-1) \Big\},$$

where $c \cdot x = \sum_{q} \big( c \cdot x_q^t + u_q \cdot y_q^t \big)$.

Set $x_r(t) = x_r^t$, $y_{rq}(t) = y_{rq}^t$
($\forall r, q \in \Theta$, $r \neq q$) and $t = t + 1$ if
$UB(t) < UB(t-1)$. If $|(UB(t) - LB(t))/UB(t)| > \epsilon$ and
$LB(t) \neq LB(t-1)$, go to step (5). Otherwise, send a stop message
to the other routing nodes.

Phase 3 - Ending

(9) Constitute subgraphs with the links of non-zero elements of
$x_r(t)$ and the non-zero disaggregated elements of $y_{rq}(t)$. This
produces the final solution of the problem.
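The bookkeeping in steps (6) and (8) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: dual vectors are held as dictionaries keyed by link, the candidate bound values are assumed to have been computed elsewhere (the LP solves are not shown), and the function names `average_duals`, `update_bounds` and `should_continue` are hypothetical.

```python
def average_duals(prev_avg, new_duals, t):
    """Running average of step (6).

    Implements v(t) = ((t - 1) / t) * v(t - 1) + (1 / t) * v^t, applied only
    to elements whose newly obtained value is non-zero; all other elements
    keep their previous averaged value.
    """
    out = dict(prev_avg)
    for key, value in new_duals.items():
        if value != 0:
            out[key] = ((t - 1) / t) * prev_avg.get(key, 0.0) + value / t
    return out


def update_bounds(lb_prev, ub_prev, lb_candidate, ub_candidate):
    """Step (8): the lower bound never decreases, the upper never increases."""
    return max(lb_prev, lb_candidate), min(ub_prev, ub_candidate)


def should_continue(lb, ub, lb_prev, eps):
    """Loop test of step (8); assumes ub is finite and non-zero."""
    return abs((ub - lb) / ub) > eps and lb != lb_prev
```

A larger threshold `eps` ends the loop earlier with a looser gap between the bounds, which is one way to exploit the trade-off between running time and optimality mentioned in the introduction.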

In the algorithm, the dual variable vector $v_q$ represents the costs
between multicast members in different clusters. The minimum values of
the elements of $v_q$ equal the costs of inter-cluster links. For
instance, the minimum value of $v_{i p_r}$
($i \in p_q$, $j \in p_r$, $q \neq r$) equals $c_{ij}$ when nodes i, j
are multicast members which can be connected with a direct link
between them. Meanwhile, the elements of $v_q$ have upper bounds since
the cost values of paths from one node to any multicast member in
neighboring clusters are bounded.

Next, we explain the bounds used in the algorithm. First, consider the
lower bound (a sentence on the dual variables $w$ is not legible in
the source). In other words, the term $w \cdot b$ in the setting of
$LB(t)$ is not larger than zero. We only need to be aware of the first
term, $\sum_r LP_1(r, v_r)$. The lower bound expressed like this also
reflects the fact that the individual local solutions do not produce a
global tree. On the other hand, from the setting of $UB(t)$, the upper
bound implies a global tree because it connects all the multicast
members together. It is a feasible solution of the original problem.
Thus, the final solution can be obtained from the upper bound.

Next, we describe the complexities of the algorithm. Before that, we
state a proposition.

Proposition 1: The routing algorithm stops within a limited number of
iterations if the link costs are rational numbers.
Proof: As stated previously, the values of the elements of v are
bounded between certain numbers that are related to the range of link
costs. We assume that the objective function takes values having the
same number of decimals as the link costs. We change v according to
$v_q(t) = ((t-1)/t)\, v_q(t-1) + (1/t)\, v_q^t$ iteratively. Moreover,
the total number of elements is finite. Therefore, the elements can
reach their stable points ($LB(t) = LB(t-1)$) within a limited number
of iterations. □

Let n denote the cluster size (number of nodes in a cluster), m the
number of clusters in the object graph, $\mu$ the maximum connectivity
(maximum number of links) of a node, $\nu$ the maximum connectivity
(maximum number of links) between two clusters, and p the number of
multicast members. According to [16], the computational complexities
of $LP_1$ and $DLP_2$ are
$O(n^4 p^4 (\mu + m)^4 \log np(\mu + m))$ and $O(\nu^4 \log \nu)$
respectively. Searching minimum-cost paths requires $O(n^2)$. Both
have smaller orders than $LP_1$. Because the algorithm stops within a
limited number of iterations, the computational complexity of the
algorithm can be counted by the complexity of one iteration, i.e.,
$O(n^4 p^4 (\mu + m)^4 \log np(\mu + m))$.

As a measure indicating the amount of messages transferred between
routing nodes, the communication complexity is $O(mn)$.

Proposition 2: The routing algorithm is convergent.

Proof: It is ensured by setting UB(t) as in step (8) that the
solution obtained will descend. As Proposition 1 says, the algorithm
stops within limited iterations. When LB(t) = LB(t - 1), it is not
possible to increase LB any more. In other words, $LP_1$ has reached
stable points. We use step (8) to force the algorithm to select the
best one from the set of possible stable points. The algorithm yields
optimum or near-optimum solutions. □
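
The role of the two bounds in the argument above can be illustrated with a small sketch. It assumes that each iteration reports a lower bound LB(t) and the cost of one feasible tree, that UB(t) is kept as the best feasible cost found so far (the effect attributed to step (8)), and that the loop stops either when LB stops changing or when the relative gap falls below a tolerance. The function name `track_bounds`, the relative-gap test, and the sample numbers are illustrative assumptions rather than the algorithm as specified in this paper.

```python
def track_bounds(bound_pairs, eps=0.05):
    """Follow (LB(t), candidate tree cost) pairs until a stopping test fires.

    bound_pairs: iterable of (lower_bound, feasible_cost) per iteration.
    eps: relative gap tolerance (assumed form of the stopping test).
    Returns a list of (t, LB(t), UB(t)) for the iterations that were run.
    """
    best_ub = float("inf")
    prev_lb = None
    history = []
    for t, (lb, candidate) in enumerate(bound_pairs, start=1):
        # Keep the best feasible cost so far, so UB(t) never increases.
        best_ub = min(best_ub, candidate)
        history.append((t, lb, best_ub))
        gap_closed = (best_ub - lb) <= eps * best_ub
        stalled = prev_lb is not None and lb == prev_lb  # LB(t) = LB(t - 1)
        if gap_closed or stalled:
            break
        prev_lb = lb
    return history


if __name__ == "__main__":
    # Hypothetical bound values, for illustration only.
    samples = [(5.0, 9.0), (6.0, 8.5), (7.0, 8.8), (7.9, 8.2), (8.0, 8.2)]
    for t, lb, ub in track_bounds(samples):
        print(t, lb, ub)
```

In this sketch the candidate cost 8.8 at the third iteration is ignored because a cheaper feasible tree is already known, which is the sense in which the upper bound descends.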

4. Numerical Results 

4.1 Network Model 

Networks with mesh topologies are chosen to be studied in the
simulation. We adopt a method similar to that used in [3] to generate
the topology. For N nodes that are randomly located on cross points
of a rectangular grid, the Euclidean distances are used to represent
the distances between node pairs. A link is introduced between node
pairs depending on the link probability



[Eq. (10): link probability between nodes i and j]

where $d_{ij}$ is the distance between nodes i and j, and D is the
maximum distance between two nodes in the graph. β is the parameter
determining the density of the graph. α is used to adjust the
connectivity of a node with other nodes; as α gets small, short links
become dense compared with long links. δ is the mean node
connectivity. γ is related to α and d. For instance, when α = 0.25
and β = 0.47, γ is approximately 12.

The available capacity of each link is assigned according to a
uniform random distribution in the range (0, B]. The link cost is set
to the reciprocal of the available capacity of that link, kept to
three decimals.
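
The generation procedure described in this subsection can be sketched as follows. The placement on grid cross points, the Euclidean distances, the capacity range (0, B], and the reciprocal cost with three decimals follow the text. The link probability used here is an assumed Waxman-style expression built from the parameters α, β, γ, δ, N and D named in the text; it is a stand-in chosen for illustration and should not be read as the paper's Eq. (10). The function name `generate_network`, the `grid_size` and `seed` parameters, and the choice of distinct cross points are likewise assumptions.

```python
import math
import random


def generate_network(n_nodes, grid_size, alpha, beta, gamma, delta, cap_max, seed=0):
    """Return (points, links); links maps (i, j), i < j, to (capacity, cost)."""
    rng = random.Random(seed)
    # Place nodes on distinct cross points of a grid_size x grid_size grid
    # (distinctness is an assumption of this sketch).
    cross_points = [(x, y) for x in range(grid_size) for y in range(grid_size)]
    points = rng.sample(cross_points, n_nodes)

    # Euclidean distance for every node pair, and the maximum distance D.
    pairs = [(i, j) for i in range(n_nodes) for j in range(i + 1, n_nodes)]
    dist = {(i, j): math.dist(points[i], points[j]) for i, j in pairs}
    d_max = max(dist.values())

    links = {}
    for (i, j), d in dist.items():
        # ASSUMED Waxman-style link probability; not taken from the paper.
        prob = (gamma * delta / n_nodes) * beta * math.exp(-d / (alpha * d_max))
        if rng.random() < min(1.0, prob):
            capacity = cap_max * (1.0 - rng.random())  # uniform on (0, cap_max]
            cost = round(1.0 / capacity, 3)            # reciprocal, three decimals
            links[(i, j)] = (capacity, cost)
    return points, links


if __name__ == "__main__":
    pts, lks = generate_network(120, 40, 0.25, 0.47, 12, 8, 10)
    print(len(pts), "nodes,", len(lks), "links")
```

Drawing the capacity as `cap_max * (1.0 - rng.random())` keeps it strictly positive, so the reciprocal cost is always defined.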




Fig. 4 Convergence on iterations (N = 120, α = 0.25, β = 0.47,
γ = 12, δ = 8). [Horizontal axis: number of iterations.]



4.2 Simulation Results 

We consider networks consisting of a comparatively large number of
nodes. In the experiment, we set δ = 8 and B = 10. Suppose the
capacity required by the multicast connection is 1. The multicast
connection needs to be established among the nodes of a multicast
group, which are selected randomly out of the network. Clusters are
generated following the principles described in Sect. 2 once the
value of the cluster size is given. After the multicast member nodes
are given, the object graph can then be determined.

The linear programming subproblems are solved by an LP solver called
lpsolve. The simulations are carried out on a Sun SS-20. Four main
issues are investigated through the simulation. All simulations are
conducted with multiple cases and the resultant data are taken from
their averages. ε is set to 0.05. It should be set properly (not
extremely small) in order to tolerate the cumulative deviation caused
by the round-off operation on the variables of v.

(a) Convergence

As an example, we let N = 120 and vary the cluster size over values
that divide the network uniformly. Figure 4 shows the convergence of
the solutions while the algorithm is executed.

The solution obtained at each iteration is compared with the optimum
solution, which is obtained by using an enumeration method. The cost
ratio is defined as the ratio of the cost of the resultant multicast
tree to that of the optimum, indicating the optimality of a solution.
The data are obtained when the multicast group size (number of nodes
in a multicast group) is set to 6.

From Fig. 4 (the numbers in brackets after the cluster sizes indicate
the mean connectivity between clusters), we can see that the
resultant solutions approach the optimum as the number of iterations
increases. The asymptotic behavior can be observed from the trend of
the lines in the figure. We should note, however, that the algorithm
will not necessarily yield the optimum solution, because round-off
operations are needed for the variables of v in the algorithm.

(b) The relation between solution optimality and algo- 
rithm's running time 

The number of iterations alone does not always indicate the
efficiency of the algorithm. Moreover, when the running time is
limited, the running time itself is the most appropriate parameter
for determining the efficiency.

Figure 5 shows the relation between solution optimality and the
algorithm's running time. The parameters used are the same as above.
In each iteration, the time needed to run the algorithm is measured.
The times of steps that can be executed concurrently are not added
together; only the maximum one is taken into account.

The efficiency of the algorithm depends on an appropriate cluster
size. Owing to the structure of the algorithm, it can be stopped
within a given time, which results in approximately optimum
solutions. In each iteration, we measure the time needed to run the
algorithm, including the communication time needed to exchange
messages between routing nodes. Herein, we set the time for
transmitting one message between two routing nodes to 0.1 second
(the same applies afterward).
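
The way the running time of one iteration is accounted for can be sketched as below. The rule of taking only the maximum over concurrently executable steps and the 0.1 second message time come from the text. Grouping the steps into sequential stages, counting messages as transmitted one after another, and the name `iteration_time` are assumptions made for this sketch.

```python
MESSAGE_TIME_S = 0.1  # time to transmit one message between two routing nodes


def iteration_time(stages, n_messages, message_time=MESSAGE_TIME_S):
    """Estimate the running time of one iteration in seconds.

    stages: list of stages executed one after another; each stage is a
            list of measured times for steps that can run concurrently.
    n_messages: number of messages, assumed to be sent sequentially.
    """
    # Within a stage only the slowest concurrent step counts.
    compute = sum(max(stage) for stage in stages if stage)
    return compute + n_messages * message_time


if __name__ == "__main__":
    # Hypothetical measurements: three clusters solved concurrently,
    # followed by one sequential step, with four messages exchanged.
    print(iteration_time([[0.2, 0.5, 0.3], [0.1]], n_messages=4))
```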

(c) Various network sizes 

The scalability of the algorithm is investigated by changing the
total number of nodes in the network (network size). Figure 6 shows
the running time needed for various numbers of nodes. Four network
sizes, 60, 100, 300 and 1000, are tested. The experiment is conducted
by setting the multicast group size to 6, 10, 20 and 30,
respectively.

In the figure, when the network size becomes large, only a moderate
increase in running time is observed. Also, there is no fast increase
in the running time as the multicast group size gets large. Together
with the complexity analysis in Sect. 3.3, we can anticipate that
this approach is efficient for large networks.

(d) Comparison with another method

Since our approach finds optimum multicast trees asymptotically, it
yields solutions efficiently. We compare it with an algorithm called
the Steiner Tree Enumeration Algorithm (STEA) [1], which is
considered to be efficient for the routing of multipoint connection
in a small local network. In our experiment, the multicast trees
among clusters are first searched using a tree enumeration algorithm,
and then STEA is used to find local solutions within clusters.

Figure 7 shows the comparison of the time efficiency of the two
algorithms under the same conditions. The network size is set to 200
and the multicast group size is 10. The cluster size is set to 5, 10,
20, 30 and 50, respectively. Since in our approach the lower and
upper bounds are obtained based on the aggregate and disaggregate
technique (ADT), we can determine that a near-optimum solution is
obtained once the gap between the bounds gets small enough. With
STEA, however, we cannot do this; we have to search over all the
cases first and then make decisions. The difference is apparent,
especially when the cluster size gets large. In addition, since we
set ε to be small (0.05), our algorithm achieved results with the
cost ratio to the optimum below […]. This shows that the algorithm
has good optimization performance as well.

Fig. 7 Comparison with STEA (N = 200).

4.3 Summary 

The simulation results show that solution convergence can be achieved
by the presented algorithm. It yields near-optimum solutions within
moderate running time. Meanwhile, the cluster size influences the
efficiency of the algorithm. This suggests that, when using the
algorithm, the cluster size should be selected appropriately so as to
obtain the solution efficiently. Scalability can be realized when the
network size becomes large and the multicast group size is relatively
large. Finally, the presented algorithm shows higher time efficiency
than STEA in finding optimum solutions.

With different network topologies, which are determined by the values
of α, β and δ, the efficiency in finding solutions will change
somewhat. We omit the details on this aspect and leave this kind of
investigation as future work in studying practical systems.

5. Conclusion 

In this paper we proposed the CBMR approach for hierarchical
multicast routing. In our scheme, as a condition, only incomplete
information is available to routing nodes. The objective is to find
the asymptotically optimum solution for the routing of multipoint
connection. We have shown by computer simulation that the routing
approach yields reasonable efficiency in terms of running time and
solution optimality. The routing approach can accommodate application
requirements by exploiting the trade-off between the running time and
the solution optimality.



The presented approach has two main characteristics: 1) it yields
asymptotically optimum solutions for the routing of multipoint
connection, and 2) the routing decisions can be made in an
environment where only partial information is available to routing
nodes. These characteristics are beneficial to multicast routing in
large-scale networks. Some extensions should be made if the approach
is required to deal with multicast routing problems that carry
additional requirements, such as delay constraints. On the other
hand, when time efficiency is more important than solution
optimality, other techniques such as heuristics would be required to
find approximate solutions.

Appendix

Given an LP problem in the form of

$$\min\ c^{T}x \quad \text{subject to} \quad Ax = b,\ x \ge 0,$$

its dual takes the form of

$$\max\ \psi^{T}b \quad \text{subject to} \quad A^{T}\psi \le c.$$
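
As a brief reminder of why such a dual is useful for bounding, the following standard step (weak duality) relates the two objective values. It is a textbook fact rather than a result specific to this paper; writing ψ for the dual variables is a notational assumption. For any primal-feasible x (Ax = b, x ≥ 0) and any dual-feasible ψ (A^T ψ ≤ c):

$$\psi^{T}b \;=\; \psi^{T}(Ax) \;=\; (A^{T}\psi)^{T}x \;\le\; c^{T}x .$$

The inequality holds because x is componentwise nonnegative and A^T ψ is componentwise no larger than c, so every dual-feasible value is a lower bound on the primal minimum.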

According to this, by adding slack variables into inequalities (6e)
to (6j), we can dualize $LP_2$ as follows.

DLP 2 ({r i q) J y rq ) : 

MaX < V iP<,y*P<, + ^2 V PrjVPrj 



■ €*». q J 



subject to 



»tf> q 



+ V d i + W id l g M Vt e Pr~q 



(A- I) 



(A- 2a) 
(A- 2b) 
(A- 2c) 



- v Pr-j + v d* q + w di q j g A/ V j € p q -.r (A- 2d) 
<Vi - v*», + w dlqj g M Vj e p q -~r (A- 2c) 
Wij $0 Vz e Pr-„ Vj € P fl -r, (iJ) € E (A- 2f) 



tX7 ld L q gO Vl 6 Pr — q 
W dl Q j gO Vj € P q ~r 



(A- 2g) 
(A-2h) 